Pseudocode

Part 1: Simulated Data

  1. Generate 10 sets of simulated data using simdata_gen.m:
    • For each set, create 100 “scans”: 50 unique subjects with 2 samples each
      • Do this by simulating 200 timesteps for each subject and taking the first 100 timesteps as sample 1 and the second 100 as sample 2
      • A is first generated from a random matrix. Elements with small absolute values are then set to zero so that 20 percent of the elements are zeros. Next, a multiple of the identity matrix is added to A. Finally, A is scaled so that its eigenvalues fall within [-1, 1], which avoids diverging time series. (Source)
      • simdata_gen(p=‘10’,d=‘10’,T=‘200’,maxiter=‘20’,option=‘1’) where
        • p is the number of output dimensions
        • d is the number of observed dimensions
        • T is the number of timesteps
        • maxiter is the maximum number of iterations for EM
        • option describes the desired A and C matrices:
          • option = 1: sparse C, regular A
          • option = 2: smooth C, sparse A
          • option = 3: smooth & sparse C, regular A
  2. For each set of simulated data, run parameter estimation using the scripts provided in the PLDS package (simulation0, simulation1, etc.)

  3. Run Eric’s discriminability script on actual A matrices of simulated data (from step 1)

  4. Run Eric’s discriminability script on estimated A matrices of simulated data (from step 2)
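
The construction of A and the two-sample split described in step 1 can be sketched as follows. This is a minimal NumPy illustration, not the code in simdata_gen.m: the function names, the diagonal weight, and the target spectral radius of 0.95 are our assumptions, chosen only to make the steps concrete.

```python
import numpy as np

def make_transition_matrix(d, sparsity=0.2, diag_weight=1.0, max_abs_eig=0.95, rng=None):
    """Hypothetical helper mirroring the described steps for building A."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((d, d))

    # Zero out the smallest-magnitude entries so that roughly `sparsity`
    # of the elements are zero before the identity is added.
    n_zero = int(round(sparsity * d * d))
    if n_zero > 0:
        smallest = np.argsort(np.abs(A).ravel())[:n_zero]
        A.flat[smallest] = 0.0

    # Add a multiple of the identity (the weight is an assumed value).
    A = A + diag_weight * np.eye(d)

    # Rescale so every eigenvalue has magnitude at most `max_abs_eig`,
    # which keeps the simulated time series from diverging.
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if spectral_radius > max_abs_eig:
        A = A * (max_abs_eig / spectral_radius)
    return A

def split_scan(Y):
    """Split a (dimensions x timesteps) scan into two equal-length samples."""
    half = Y.shape[1] // 2
    return Y[:, :half], Y[:, half:2 * half]
```

Note that adding the identity makes any zeroed diagonal entries nonzero again, so the final fraction of zeros can be somewhat below 20 percent in this sketch.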

Part 2: Real Data

  1. For each scan in BNU1 dataset, run parameter estimation with the following inputs:
    • Y = subject scan
    • A = identity
    • C = identity
    • Q = identity
    • R = identity
    • Pi = first column of subject scan (ROI values at first time-step)
    • V = identity
    • Tolerance = 1e-6
    • Iterations = 20
    • Output = estimated [a, c, q, r, pi, v]
  2. Run Eric’s discriminability script on estimated A matrices of BNU1 dataset (from step 1)
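
The initial values listed in step 1 can be collected as in the sketch below. It assumes each scan is stored as an ROI-by-timestep array and that the latent dimension equals the number of ROIs (implied by identity A and C together with Pi taken from the scan). The function name and dictionary keys are hypothetical and do not correspond to the PLDS package interface.

```python
import numpy as np

def initial_parameters(Y):
    """Build the starting values described above for one scan.

    Y is assumed to have shape (n_rois, n_timesteps).
    """
    n_rois = Y.shape[0]
    return {
        "A": np.eye(n_rois),    # state transition matrix
        "C": np.eye(n_rois),    # observation matrix
        "Q": np.eye(n_rois),    # state noise covariance
        "R": np.eye(n_rois),    # observation noise covariance
        "Pi": Y[:, 0].copy(),   # initial state mean: ROI values at first timestep
        "V": np.eye(n_rois),    # initial state covariance
        "tol": 1e-6,            # convergence tolerance
        "max_iter": 20,         # maximum number of iterations
    }
```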

Simulation

Good performance

The application of discriminability to KFS parameter estimation should run well on the simulations provided in the PLDS package. The actual A matrices generated by the given scripts all have perfectly separated inter- and intra-subject distances, meaning the MNR is 1.

Bad performance

TBD

Analysis

We will qualitatively assess the results by looking at the MNR and distance plots. For the simulated data, the intra-subject and inter-subject distances should be completely separated. For real data, the farther apart the intra- and inter-subject distances are, the better.

We will quantitatively assess the results by looking at the MNR value. The MNR value should be exactly 1 for the actual A matrices and close to 1 for the estimated A matrices in the simulations. For real data, the MNR value should ideally be better than the MNR of just the vanilla time-series.
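
To make the target value of 1 concrete, the sketch below computes a rank-based discriminability score from a set of matrices. It is our own illustration of one common definition and may differ in detail from Eric’s script, which is not reproduced here. The function name, the Euclidean (Frobenius) distance, and the tie handling are assumptions.

```python
import numpy as np

def mean_normalized_rank(samples, subject_ids):
    """For every same-subject pair (i, j), compute the fraction of
    other-subject samples that lie farther from i than j does,
    then average over all such pairs. Requires at least two subjects."""
    X = np.asarray([np.ravel(s) for s in samples], dtype=float)
    ids = np.asarray(subject_ids)

    # Pairwise Euclidean distances between flattened matrices.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    scores = []
    for i in range(len(ids)):
        other = ids != ids[i]      # samples from different subjects
        same = ids == ids[i]       # samples from the same subject
        same[i] = False            # exclude the sample itself
        for j in np.flatnonzero(same):
            farther = np.sum(D[i, other] > D[i, j])
            ties = np.sum(D[i, other] == D[i, j])   # ties count as half
            scores.append((farther + 0.5 * ties) / other.sum())
    return float(np.mean(scores))
```

Under this definition, a score of 1 means every sample is closer to its own subject’s other sample than to any sample from another subject, which is the perfect separation expected for the actual A matrices.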

The summary plot visualization for the simulated data will be a bar graph showing the MNR values for the actual and estimated A matrices for each simulation.

Code

Good Simulation Data

We can plot the actual A matrices using square heatmaps. We would expect the main diagonal of the map to be red (corresponding to higher values) because each ROI depends mostly on itself. The rest of the map would be some combination of blues.

[Figure: Simulation 9, Subject 1 Actual A Heatmap]

The plots above are in line with our expectations.

We expect the MNR of the actual A matrices to be 1, because they are generated as such, so their distances should be perfectly separable. We expect the MNR of the estimated A matrices to be close to 1, because the KFS parameter estimation algorithm is purported to be accurate, so their distances should be almost perfectly separable.

[Figure: Simulation 0 Actual]
[Figure: Simulation 0 Estimated]

[Figure: Simulation 1 Actual]
[Figure: Simulation 1 Estimated]

[Figure: Simulation 2 Actual]
[Figure: Simulation 2 Estimated]

[Figure: Simulation 3 Actual]
[Figure: Simulation 3 Estimated]

[Figure: Simulation 4 Actual]
[Figure: Simulation 4 Estimated]

[Figure: Simulation 5 Actual]
[Figure: Simulation 5 Estimated]

[Figure: Simulation 6 Actual]
[Figure: Simulation 6 Estimated]

[Figure: Simulation 7 Actual]
[Figure: Simulation 7 Estimated]

[Figure: Simulation 8 Actual]
[Figure: Simulation 8 Estimated]

[Figure: Simulation 9 Actual]
[Figure: Simulation 9 Estimated]

As expected, the MNR values for the actual A matrices are 1, and the distances appear to be perfectly separable. However, the results for the estimated A matrices are not what we expected. The estimation algorithm seems to have performed poorly: the estimated A matrices show very poor discriminability and distance separation.

Below, we plot the summary statistics: